Finito: A Faster, Permutable Incremental Gradient Method for Big Data Problems
Recent advances in optimization theory have shown that smooth strongly convex finite sums can be minimized faster than by treating them as a black-box "batch" problem. In this work we introduce a new method in this class with a theoretical convergence rate four times faster than existing methods, for sums with sufficiently many terms. This method is also amenable to a sampling-without-replacement scheme that in practice gives further speed-ups. We give empirical results showing state-of-the-art performance.
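A minimal sketch of a Finito-style update, under the assumptions that each term is smooth, the average objective is mu-strongly convex, and n is large enough that the step parameter alpha = 2 applies (the paper's big-data regime). All names are illustrative, not the authors' implementation:

    import numpy as np

    def finito(grad_fns, dim, mu, n_epochs=50, alpha=2.0, seed=0):
        # grad_fns: one gradient function f_i'(w) per term of the sum.
        # Maintains a stored point phi_i and gradient per term, plus
        # running means so each step costs one gradient evaluation.
        n = len(grad_fns)
        rng = np.random.default_rng(seed)
        phi = np.zeros((n, dim))
        grads = np.array([g(phi[i]) for i, g in enumerate(grad_fns)])
        phi_bar, grad_bar = phi.mean(axis=0), grads.mean(axis=0)
        for _ in range(n_epochs):
            # Sampling without replacement: one fresh permutation per
            # epoch, the scheme the abstract reports as faster in practice.
            for j in rng.permutation(n):
                w = phi_bar - grad_bar / (alpha * mu)
                g_new = grad_fns[j](w)
                phi_bar += (w - phi[j]) / n
                grad_bar += (g_new - grads[j]) / n
                phi[j], grads[j] = w, g_new
        return phi_bar - grad_bar / (alpha * mu)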
Optimization of robust loss functions for weakly-labeled image taxonomies: An ImageNet case study
Work presented at the 8th International Conference EMMCVPR, held in St. Petersburg, 25-27 July 2011.

The recently proposed ImageNet dataset consists of several million images, each annotated with a single object category. However, these annotations may be imperfect, in the sense that many images contain multiple objects belonging to the label vocabulary. In other words, we have a multi-label problem but the annotations include only a single label (and not necessarily the most prominent). Such a setting motivates the use of a robust evaluation measure, which allows a limited number of labels to be predicted: as long as one of the predicted labels is correct, the overall prediction is considered correct. This is indeed the type of evaluation measure used to assess algorithm performance in a recent competition on ImageNet data. Optimizing such performance measures presents several hurdles even with existing structured output learning methods. Indeed, many of the current state-of-the-art methods optimize the prediction of only a single output label, ignoring this 'structure' altogether. In this paper, we show how to directly optimize continuous surrogates of such performance measures using structured output learning techniques with latent variables. We use the output of existing binary classifiers as input features in a new learning stage which optimizes the structured loss corresponding to the robust performance measure. We present empirical evidence that this allows us to 'boost' the performance of existing binary classifiers, which are the state of the art for the task of object classification in ImageNet.

Part of this work was carried out when both AR and TC were at INRIA Grenoble, Rhône-Alpes. NICTA is funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy and the Australian Research Council through the ICT Centre of Excellence program. This work was partially funded by the QUAERO project supported by OSEO, the French State agency for innovation, and by MICINN under project MIPRCV Consolider Ingenio CSD2007-00018.
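For concreteness, a minimal sketch of the robust evaluation measure described above: a prediction counts as correct if any of k predicted labels matches the annotated label. The function name and the choice k = 5 (as in the ImageNet competition) are illustrative assumptions:

    import numpy as np

    def robust_topk_error(scores, true_labels, k=5):
        # scores: (n_images, n_classes) outputs of the binary classifiers.
        # true_labels: (n_images,) single annotated label per image.
        # A prediction is correct if the annotated label appears among
        # the k highest-scoring labels.
        topk = np.argsort(-scores, axis=1)[:, :k]
        correct = (topk == true_labels[:, None]).any(axis=1)
        return 1.0 - correct.mean()

The paper optimizes a continuous surrogate of this loss with latent-variable structured output learning; the sketch above only evaluates it.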
Faster graphical models for point-pattern matching
It has been shown that isometric matching problems can be solved exactly in polynomial time, by means of a Junction Tree with small maximal clique size. Recently, an iterative algorithm was presented which converges to the same solution an order of magnitude faster. Here, we build on both of these ideas to produce an algorithm with the same asymptotic running time as the iterative solution, but which requires only a single iteration of belief propagation. Thus our algorithm is much faster in practice, while maintaining similar error rates.
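The matching algorithm itself is beyond a short sketch, but the reason a single iteration can suffice is standard: on tree-structured (here, chain) models, one max-product sweep plus backtracking computes the exact MAP assignment. A minimal illustration, with all names assumed rather than taken from the paper:

    import numpy as np

    def map_on_chain(unary, pairwise):
        # unary: (T, N) node scores; pairwise: (N, N) shared edge scores.
        # One forward max-product pass, then backtracking: exact MAP.
        T, N = unary.shape
        score = unary[0].copy()
        back = np.zeros((T, N), dtype=int)
        for t in range(1, T):
            cand = score[:, None] + pairwise   # [prev state, next state]
            back[t] = cand.argmax(axis=0)
            score = cand.max(axis=0) + unary[t]
        states = np.empty(T, dtype=int)
        states[-1] = score.argmax()
        for t in range(T - 1, 0, -1):
            states[t - 1] = back[t, states[t]]
        return states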
Exploiting Data-Independence for Fast Belief-Propagation
Maximum a posteriori (MAP) inference in graphical models requires that we maximize the sum of two terms: a data-dependent term, encoding the conditional likelihood of a certain labeling given an observation, and a data-independent term, encoding some prior on labelings. Often, data-dependent factors contain fewer latent variables than data-independent factors; for instance, many grid- and tree-structured models contain only first-order conditionals despite having pairwise priors. In this paper, we note that MAP inference in such models can be made substantially faster by appropriately preprocessing their data-independent terms. Our main result is to show that message-passing in any such pairwise model has an expected-case exponent of only 1.5 on the number of states per node, leading to significant improvements over existing quadratic-time solutions.
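A minimal sketch of the kind of preprocessing that yields the expected exponent of 1.5: sort each column of the data-independent prior offline, sort the data-dependent term once per message, and walk the two orders until some state index appears in both sorted prefixes; the maximizer is then guaranteed to lie in the union of the prefixes, which by a birthday-paradox argument has expected size O(sqrt(N)). Names here are illustrative, and the paper's algorithm and analysis are more general:

    import numpy as np

    def max_sum_pair(a, order_a, b, order_b):
        # argmax_i a[i] + b[i], given descending sort orders of a and b.
        # Stop at the first index seen in both prefixes: any index outside
        # both prefixes is dominated, so the max lies in their union.
        in_a, in_b = set(), set()
        for k in range(len(a)):
            in_a.add(order_a[k])
            in_b.add(order_b[k])
            if order_a[k] in in_b or order_b[k] in in_a:
                return max(in_a | in_b, key=lambda i: a[i] + b[i])

    def fast_message(unary, prior, prior_order):
        # unary: (N,) data-dependent scores; prior: (N, N) data-independent
        # pairwise scores; prior_order = np.argsort(-prior, axis=0),
        # computed once offline since it does not depend on the data.
        order_u = np.argsort(-unary)   # one O(N log N) sort per message
        msg = np.empty(len(unary))
        for xj in range(len(unary)):
            col = prior[:, xj]
            i = max_sum_pair(unary, order_u, col, prior_order[:, xj])
            msg[xj] = unary[i] + col[i]
        return msg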